
Microsoft CodeView and Utilities
================================
CHAPTER 4 ___ CODEVIEW EXPRESSIONS
(Second half)

4.4  Pascal Expressions

The Pascal expression evaluator uses a subset of the most commonly used Pascal
operators.  The CodeView Pascal expression operators are listed in Table 4.9
in order of precedence.

Table 4.9  CodeView Pascal Operators
------------------------
Precedence	Operators
------------------------
(Highest)
1			-  NOT ADR ADS    (unary)
2			* / DIV MOD AND
3			+ - OR XOR
4			=  <>  <=  >=  <  >
5			:=
(Lowest)

1,3  The minus sign with precedence 1 is the unary
	minus indicating the sign of a number; the
	minus sign with precedence 3 is a binary
	minus indicating subtraction.
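
The precedence levels in Table 4.9 can be illustrated with a small sketch
that adds the parentheses implied by the table.  It is not the debugger's
parser: the names tokenize and group are hypothetical, operators at the same
level are assumed to group from left to right, and the sketch only shows
grouping, it evaluates nothing.

```python
import re

# Binary operators from Table 4.9; a smaller level binds more tightly.
BINARY_LEVEL = {
    "*": 2, "/": 2, "DIV": 2, "MOD": 2, "AND": 2,
    "+": 3, "-": 3, "OR": 3, "XOR": 3,
    "=": 4, "<>": 4, "<=": 4, ">=": 4, "<": 4, ">": 4,
    ":=": 5,
}
UNARY = {"-", "NOT", "ADR", "ADS"}  # precedence 1
TOKEN = re.compile(r"\s*(<>|<=|>=|:=|[A-Za-z_]\w*|\d+|[-+*/=<>()])")


def tokenize(text):
    """Split an expression into tokens, upper-casing operator keywords."""
    tokens, pos, text = [], 0, text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        word = match.group(1)
        is_keyword = word.upper() in BINARY_LEVEL or word.upper() in UNARY
        tokens.append(word.upper() if is_keyword else word)
        pos = match.end()
    return tokens


def group(text):
    """Return the expression fully parenthesized by Table 4.9 precedence."""
    tokens, pos = tokenize(text), 0

    def operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token in UNARY:
            return f"({token} {operand()})"
        if token == "(":
            inner = binary(5)
            pos += 1  # skip the closing parenthesis
            return inner
        return token

    def binary(level):
        nonlocal pos
        if level == 1:
            return operand()
        left = binary(level - 1)
        while pos < len(tokens) and BINARY_LEVEL.get(tokens[pos]) == level:
            op = tokens[pos]
            pos += 1
            left = f"({left} {op} {binary(level - 1)})"
        return left

    return binary(5)
```

Under these assumptions, group("a = 1 AND b = 2") yields
((a = (1 AND b)) = 2), because AND shares precedence 2 with the
multiplication operators.  Comparisons joined by AND or OR therefore need
explicit parentheses, as in (a = 1) AND (b = 2).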

See the Microsoft Pascal Reference Manual to learn how Pascal operators can be
combined with identifiers and constants to form expressions.

The asterisk (*) is supported as both the multiplication and string concaten-
ation operator.

Set variables and set operations are not supported.  The colon operator (:),
which the other expression evaluators support, is not supported by the Pascal
expression evaluator.

Enumerated constants and variables can appear in expressions when used with
the ORD, PRED, or SUCC functions listed in Table 4.10.

With the Pascal expression evaluator, the period (.) has its normal use as a
field-selection operator, but it also has an extended use as a specifier of
local variables in parent routines.  The syntax is shown below:

routine.variable

The routine must be a high-level language routine (procedure or function), and
the variable must be a local variable within the specified routine.  The
variable cannot be a register variable.

The Pascal language has a feature known as "nested scope" that enables the
user to define routines inside of routines, in which each routine has access
to the local variables of the routines that enclose it.  But with the Pascal
expression evaluator for CodeView, there is no nested scope.  You must use the
period operator (.) to access any local variable not declared in the
currently executing routine.  For example, consider this code:

procedure test1;
    var a, b: integer;
    procedure test2;
        var m, n : integer;
        procedure test3;
            var x, y : integer;
            begin
            x := m + n + a + b;
            .
            .
            .

In the example above, the procedure test3 has access to the variables a, b, m,
and n, as well as x and y.  However, if we are in CodeView executing test3,
then variables declared outside of test3 can be accessed in CodeView commands
only with the aid of the period operator, as in:

test1.a
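
The lookup rule can be modeled roughly as follows.  This is only a sketch:
the FRAMES table, its values, and the lookup function are hypothetical, and
the sketch ignores the period's other role as a field-selection operator.

```python
# Hypothetical locals of each active routine while test3 is executing.
FRAMES = {
    "test1": {"a": 1, "b": 2},
    "test2": {"m": 3, "n": 4},
    "test3": {"x": 0, "y": 0},
}


def lookup(name, current_routine, frames=FRAMES):
    """Resolve `variable` or `routine.variable`, ignoring case."""
    lowered = {
        routine.lower(): {var.lower(): value for var, value in local.items()}
        for routine, local in frames.items()
    }
    routine, dot, variable = name.lower().rpartition(".")
    if not dot:
        # No qualifier: only the currently executing routine is searched.
        routine = current_routine.lower()
    try:
        return lowered[routine][variable]
    except KeyError:
        raise LookupError(f"{name!r} is not visible from {current_routine}") from None
```

With test3 as the current routine, lookup("x", "test3") and
lookup("test1.a", "test3") both succeed, while lookup("a", "test3") fails
because no enclosing routine is searched.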

When a Pascal expression is used as an argument with a command that takes
multiple arguments, the expression should not have any internal spaces.  For
example, count+6 is allowed, but count + 6 may be interpreted as three
separate arguments.  Some commands (such as the Display Expression command) do
permit spaces in expressions.
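
The effect of internal spaces can be pictured with a deliberately simplified
model in which a command line is split on white space.  How the debugger
actually separates arguments is not described here; split_arguments is a
hypothetical helper.

```python
def split_arguments(command_line):
    """Split a command line into its name and white-space-separated arguments."""
    name, *arguments = command_line.split()
    return name, arguments
```

In this model "DB count+6" carries the single argument count+6, whereas
"DB count + 6" carries three arguments: count, +, and 6.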

4.4.1  Pascal Symbols

<*> Syntax

name

A symbol is a name that represents a register, segment address, offset
address, or a full 32-bit address.  At the Pascal source level, a symbol is a
variable name or the name of a function.  Symbols (also called identifiers)
follow the naming rules of the Pascal compiler.  Note that symbols are never
case sensitive with the Pascal expression evaluator.  If you have turned on
case sensitivity, it is turned off automatically when a symbol is used in an
expression.

In assembly language output or in output from the Examine Symbols command, the
CodeView debugger displays some symbol names in the object-code format
produced by the Microsoft Pascal Compiler.  This format includes a leading
underscore.  For example, the function main is displayed as _main.  Only
global labels (such as procedure names) are shown in this format.  You do not
need to include the underscore when specifying such a symbol in CodeView
commands.  Labels within library routines are sometimes displayed with a
double underscore (__chkstk).  You must use leading underscores when accessing
these labels with CodeView commands.

4.4.2  Pascal Constants

<*>  Syntax

digits		Default radix
radix#digits	Specified radix
#digits		Hexadecimal radix

Numbers used in CodeView commands represent integer constants.  These
constants are made up of octal, decimal, or hexadecimal digits, and are
entered in the current input radix.  The default for the radix for the Pascal
expression evaluator is decimal.

The Pascal expression evaluator uses the same method for accepting constants
as the FORTRAN expression evaluator.  For further information and examples,
see Section 4.2.2, "FORTRAN Constants."
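
The three syntax forms can be read as in the following sketch.  The function
parse_constant is hypothetical, and the sketch assumes that the radix prefix
itself is written in decimal; it also accepts whatever Python's int() accepts,
which may be more lenient than the debugger.

```python
def parse_constant(text, default_radix=10):
    """Parse `digits`, `radix#digits`, or `#digits` into an integer."""
    prefix, hash_mark, digits = text.partition("#")
    if not hash_mark:
        return int(text, default_radix)   # digits: current input radix
    radix = int(prefix) if prefix else 16  # #digits: hexadecimal
    return int(digits, radix)             # radix#digits: specified radix
```

For example, parse_constant("2#1010") and parse_constant("#A") both give 10,
while parse_constant("100") gives 100 in the default decimal radix.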

4.4.3  Pascal Strings

<*> Syntax

'string'

Strings can be specified as expressions in the Pascal format.

<*> Example

>EA message 'This string is okay.'

The example uses the Enter ASCII command (EA) to enter the given string into
memory, starting at the address of the variable message.

4.4.4  Pascal Intrinsic Functions

When entering a Pascal expression, you can use a limited number of Pascal
intrinsic functions.  The purpose of these functions is to support the use of
enumerated types, to access array bounds and to convert one type of data to
another.  The Pascal intrinsic functions recognized by the CodeView debugger
are listed in Table 4.10.  See the Microsoft Pascal Reference Guide for a
complete description of the Pascal intrinsic functions.

Table 4.10 - Pascal Intrinsic Functions Supported by the CodeView Debugger
-----------------------------------------------------------------------------
                                                  Argument           Function
Name                    Definition                Type               Type
-----------------------------------------------------------------------------
BYLONG(lowrd,hiwrd)     Builds 4-byte integer     integer or word    integer4
BYWORD(lobyte,hibyte)   Builds word from 2 bytes  byte               word
CHR(ord)                Data-type conversion      ordinal            char
FLOAT(integer)          Data-type conversion      integer            real
FLOAT4(integer4)        Data-type conversion      integer4           real
FLOAT8(integer)         Data-type conversion      integer4           real8
LOBYTE(int)             Returns least             integer or word    byte
                        significant byte
LOWER(arr)              Lowest bound of           array              constant
                        an array
ORD(enum)               Data-type conversion      enumerated value   integer
PRED(enum)              Ordinal value of          enumerated value   integer
                        predecessor
SUCC(enum)              Ordinal value of          enumerated value   integer
                        successor
TRUNC(real)             Truncates toward 0        real               integer
TRUNC4(real)            Truncates toward 0        real               integer4
TRUNC8(real)            Truncates toward 0        real8              integer4
UPPER(arr)              Upper bound of            array              constant
                        an array
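
The byte-oriented functions in Table 4.10 amount to simple bit operations,
sketched below.  The Python names mirror the table but are illustrative only;
the results are shown as unsigned bit patterns, whereas the debugger's
integer4 type is presumably signed.

```python
import math


def byword(lobyte, hibyte):
    """Build a 16-bit word from a low byte and a high byte."""
    return ((hibyte & 0xFF) << 8) | (lobyte & 0xFF)


def bylong(lowrd, hiwrd):
    """Build a 4-byte integer from a low word and a high word."""
    return ((hiwrd & 0xFFFF) << 16) | (lowrd & 0xFFFF)


def lobyte(value):
    """Return the least significant byte of an integer or word."""
    return value & 0xFF


def trunc(real):
    """Truncate toward 0, so -2.7 becomes -2 rather than -3."""
    return math.trunc(real)
```

For instance, byword(0x34, 0x12) gives 0x1234, and lobyte(0x1234) gives 0x34.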

4.5  Assembly Expressions

The /ZI option, available with Version 5.0 and later of the Microsoft Macro
Assembler, provides variable size information for the CodeView debugger.  This
makes for correct evaluation of expressions derived from assembly code (except
with arrays, which are discussed later in this section).  If you have an
earlier version of the Macro Assembler, you will need to use C type casts to
get correct evaluation.

When a program assembles or when the Auto switch is on, source files with an
.ASM extension will cause CodeView to select the C expression evaluator.
However, the following options will be set differently from the C default
options:

*	System radix is hexadecimal (not decimal).

*	Register window is on.

*	Case Sense is off.

The C expression evaluator supports the memory operators described in Section
4.8, and generally is the appropriate expression evaluator to debug assembly
with, because of its flexibility.
However, you cannot always use the C expression evaluator to specify an
expression exactly as it would appear in assembly code.  The list below
describes the principal differences between assembler syntax and syntax used
with the C expression evaluator.

----
Note
----
The examples below present expressions, not CodeView commands.  You can
see the results of these expressions by using them as operands for the
Display Expression command (?), described in Chapter 6, "Examining Data
and Expressions."
====

In the following list, examples of assembly source code are shown in the left-
hand column.  Corresponding CodeView expressions (with the C expression evalu-
ator) are shown in the right-hand column.

1.	Register indirection.

The C expression evaluator does not extend the use of brackets to reg-
isters.  To refer to the byte, word, or double word pointed to by a
register, use the BY, WO, or DW operator.

	BYTE PTR [bx]			BY bx
	WORD PTR [bp]			WO bp
	DWORD PTR [bp]			DW bp

2.	Register indirection with displacement.

To perform based, indexed, or based-index indirection with a displace-
ment, use the BY, WO, or DW operator along with addition in a complex
expression.

	BYTE PTR [di+6]		BY di+6
	BYTE PTR [si] [bp+6]	BY si+bp+6
	WORD PTR [bx] [si]		WO bx+si

3.	Taking the address of a variable.

Use the ampersand (&) to get the address of a variable with the C
expression evaluator.

	OFFSET var			&var

4.	The PTR operator.

With the CodeView debugger, C type casts perform the same function as
the assembler PTR operator.

	BYTE PTR var			(char) var
	WORD PTR var			(int) var
	DWORD PTR var			(long) var

5.	Accessing array elements.

Accessing arrays declared in assembly code is problematic, because the
Macro Assembler emits no type information to indicate which variables
are arrays.  Therefore the CodeView debugger treats an array name like
any other variable.

In C, an array name is equated with the address of the first element.
Therefore, if you prefix an array with the address operator (&), the
C expression evaluator gives correct results for array operations.

	string[12]				(&string)[12]
	warray[bx+di]				(&warray)[(bx+di)/2]
	darray[4]					(&darray)[1]

In the second and third examples above, notice that the indexes used
in the assembly source-code expressions differ from the indexes used
in the CodeView expressions.  This difference is necessary because C
arrays are automatically scaled according to the size of elements.
In assembly, the program must do the scaling.
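
The index conversion for arrays can be sketched as a division of the assembly
byte offset by the element size.  The table ELEMENT_SIZE and the function
c_index are hypothetical names used only for illustration.

```python
ELEMENT_SIZE = {"byte": 1, "word": 2, "dword": 4}  # size in bytes


def c_index(byte_offset, element):
    """Convert an assembly byte offset into a C-style element index."""
    size = ELEMENT_SIZE[element]
    if byte_offset % size:
        raise ValueError("offset does not fall on an element boundary")
    return byte_offset // size
```

A byte array keeps its index (offset 12 is element 12), a word array halves
it, and offset 4 in a double-word array is element 1.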

4.6  Line Numbers

Line numbers are useful for source-level debugging.  They correspond to the
lines in source-code files (BASIC, C, FORTRAN, or Macro Assembler).  In source
mode, you see a program displayed with each line numbered sequentially.  The
CodeView debugger allows you to use these same numbers to access parts of a
program.

<*> Syntax

.[filename:]linenumber

The address corresponding to a source-line number can be specified as
linenumber prefixed with a period (.).  The CodeView debugger assumes that the
source line is in the current source file, unless you specify the optional
filename followed by a colon and the line number.

The CodeView debugger displays an error message if filename does not exist, or
if no source line exists for the specified number.

<*> Examples

>V .100

The example above uses the View command (V) to display code starting at the
source line 100.  Since no file is indicated, the current source file is
assumed.

>V .SAMPLE.FOR:10

>V .EXAMPLE.BAS:22

>V .DEMO.C:301

The examples above use V to display source code starting at line 10 of
SAMPLE.FOR, line 22 of EXAMPLE.BAS, and line 301 of DEMO.C, respectively.
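
The line-number syntax can be taken apart as in the sketch below.  The
function parse_line_spec is hypothetical; it splits at the last colon so that
a file name containing periods, such as SAMPLE.FOR, is kept whole, and it does
not check that the file or the line exists.

```python
def parse_line_spec(spec, current_file):
    """Split `.[filename:]linenumber` into (filename, linenumber)."""
    if not spec.startswith("."):
        raise ValueError("a line-number address starts with a period")
    filename, colon, number = spec[1:].rpartition(":")
    if not number.isdigit():
        raise ValueError(f"invalid line number: {number!r}")
    # Without "filename:", the current source file is assumed.
    return (filename if colon else current_file), int(number)
```

For example, parse_line_spec(".100", "DEMO.C") gives ("DEMO.C", 100), and
parse_line_spec(".SAMPLE.FOR:10", "DEMO.C") gives ("SAMPLE.FOR", 10).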

4.7  Registers and Addresses

This section presents alternative ways to refer to objects in memory,
including values stored in the processor's registers.  Addresses are basic to
each of the expression evaluators.  A data symbol represents an address in a
data segment; a procedure name represents an address in a code segment.  All
of the syntax in this section can be considered as an extension to the BASIC,
C, or FORTRAN expression evaluator.

4.7.1  Registers

<*> Syntax

[@]register

You can specify a register name if you want to use the current value stored in
the register.  Registers are rarely needed in source-level debugging, but they
are used frequently for assembly language debugging.

When you specify an identifier, the CodeView debugger first checks the symbol
table for a symbol with that name.  If the debugger does not find a symbol, it
checks to see if the identifier is a valid register name.  If you want the
identifier to be considered a register, regardless of any name in the symbol
table, use the "at" sign (@) as a prefix to the register name.  For example,
if your program has a symbol called AX, you could specify @AX to refer to the
AX register.  You can avoid this problem entirely by making sure that
identifier names in your program do not conflict with register names.

The register names known to the CodeView debugger are shown in Table 4.11.
Note that the 32-bit registers are available only if the 386 option is on and
if the computer is a 386 machine running in 386 mode.

Table 4.11  -  Registers
------------------------------------------
Type			Names
------------------------------------------
8-bit high byte		AH   BH   CH   DH
8-bit low byte		AL   BL   CL   DL
16-bit general purpose	AX   BX   CX   DX
16-bit segment		CS   DS   SS   ES
16-bit pointer		SP   BP   IP
16-bit index		SI   DI
32-bit general purpose	EAX  EBX  ECX  EDX
32-bit pointer		ESP  EBP
32-bit index		ESI  EDI
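
The order in which an identifier is resolved (symbol table first, then
register names, with @ forcing a register) can be sketched as follows.  The
resolve function and its return values are hypothetical, names are compared
without regard to case for simplicity, and the 386 restriction on 32-bit
registers is not modeled.

```python
# Register names from Table 4.11.
REGISTERS = set(
    "AH BH CH DH AL BL CL DL AX BX CX DX CS DS SS ES "
    "SP BP IP SI DI EAX EBX ECX EDX ESP EBP ESI EDI".split()
)


def resolve(identifier, symbols):
    """Classify an identifier as a program symbol or a register."""
    name = identifier.upper()
    if name.startswith("@"):
        # The "at" sign bypasses the symbol table entirely.
        if name[1:] in REGISTERS:
            return ("register", name[1:])
        raise LookupError(f"{identifier} is not a register name")
    if name in symbols:
        return ("symbol", name)
    if name in REGISTERS:
        return ("register", name)
    raise LookupError(f"{identifier} is neither a symbol nor a register")
```

If the program defines a symbol AX, resolve("AX", {"AX"}) reports a symbol,
while resolve("@AX", {"AX"}) reports the register.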


4.7.2  Addresses

<*> Syntax

[segment:]offset

Addresses can be specified in the CodeView debugger through the use of the
colon operator as a segment:offset connector.  Both the segment and the offset
are made up of expressions.

A full address has a segment and an offset, separated by a colon.  A partial
address has just an offset; a default segment is assumed.  The default segment
varies, depending on the command with which the address is used.  Commands
that refer to data (Dump, Enter, Watch, and Tracepoint) use the contents of
the DS register.  Commands that refer to code (Assemble, Breakpoint Set, Go,
Unassemble, and View) use the contents of the CS register.

Full addresses are seldom necessary in source-level debugging.  Occasionally
they may be convenient for referring to addresses outside the program, such as
BIOS (basic input/output system) or DOS addresses.

<*> Examples

>DB 100

In the example above, the Dump Bytes command (DB) is used to dump memory
starting at offset address 100.  Since no segment is given, the data segment
(the default for Dump commands) is assumed.

>DB ARRAY(COUNT)         ;* FORTRAN/BASIC example

In the example above, the Dump Bytes command is used to dump memory starting
at the address of the variable ARRAY(COUNT).  In C, a similar variable might
be denoted as array[count].

>DB label+10

In the example above, the Dump Bytes command is used to dump memory starting
at a point 10 bytes beyond the symbol label.

>DB ES:200

In the example above, the Dump Bytes command is used to dump memory at the
address having the segment value stored in ES and the offset address 200.
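
The choice of default segment can be summarized in a short sketch.  The
function full_address, the command-name sets, and the register values in the
usage note are hypothetical; only the DS and CS defaults come from the text
above.

```python
DATA_COMMANDS = {"Dump", "Enter", "Watch", "Tracepoint"}
CODE_COMMANDS = {"Assemble", "Breakpoint Set", "Go", "Unassemble", "View"}


def full_address(command, offset, registers, segment=None):
    """Complete a partial address with the command's default segment."""
    if segment is None:
        if command in DATA_COMMANDS:
            segment = registers["DS"]  # data commands default to DS
        elif command in CODE_COMMANDS:
            segment = registers["CS"]  # code commands default to CS
        else:
            raise ValueError(f"no default segment known for {command!r}")
    return segment, offset
```

With registers = {"CS": 0x1000, "DS": 0x2000, "ES": 0x3000}, a Dump of offset
0x100 resolves to (0x2000, 0x100), a View of the same offset resolves to
(0x1000, 0x100), and passing segment=registers["ES"] corresponds to an
explicit ES: prefix.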

4.7.3  Address Ranges

<*> Syntax

startaddress endaddress
startaddress L count

A range is a pair of memory addresses that bound a sequence of contiguous
memory locations.

You can specify a range in two ways.  One way is to give the start and end
points.  In this case the range covers startaddress to endaddress,
inclusively.  If a command takes a range, but you do not supply a second
address, the CodeView debugger usually assumes the default range.  Each
command has its own default range. (The most common default range is 128
bytes.)

You can also specify a range by giving its starting point and the number of
objects you want included in the range.  This type of range is called an
object range.  In specifying an object range, startaddress is the address of
the first object in the list, L indicates that this is an object range rather
than an ordinary range, and count specifies the number of objects in the
range.

The size of the objects is the size taken by the command.  For example, the
Dump Bytes command (DB) has byte objects, the Dump Words command (DW) has
words, the Unassemble command (U) has instructions, and so on.

<*> Examples

>DB buffer

The example above dumps a range of memory starting at the symbol buffer.
Since the end of the range is not given, the default size (128 bytes for the
Dump Bytes command) is assumed.

>DB buffer buffer+20

The example above dumps a range of memory starting at buffer and ending at
buffer+20 (the point 20 bytes beyond buffer).

>DB buffer L 20

The example above uses an object range to dump 20 bytes starting at buffer
(one byte fewer than the inclusive range in the previous example).  The L
indicates that the range is an object range, and 20 is the number of objects
in the range.  Each object has a size of 1 byte, since that is the command
size.

>U funcname-30 funcname

The example above uses the Unassemble command (U) to list the assembly
language statements starting 30 instructions before funcname and continuing to
funcname.
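
The two range forms can be compared with a sketch that computes the bytes
covered.  The function byte_span is hypothetical and handles only the Dump
Bytes and Dump Words object sizes; the 128-byte default is the one given for
Dump Bytes, and applying it to other commands is an assumption.

```python
OBJECT_SIZE = {"DB": 1, "DW": 2}  # bytes per object for two Dump commands


def byte_span(command, start, end=None, count=None, default_bytes=128):
    """Return the (first, last) byte offsets covered, both inclusive."""
    if end is not None and count is not None:
        raise ValueError("give an end address or an L count, not both")
    if end is not None:
        return start, end                      # startaddress endaddress
    if count is not None:
        length = count * OBJECT_SIZE[command]  # startaddress L count
    else:
        length = default_bytes                 # no second argument
    return start, start + length - 1
```

With a start offset of 0x100, an end address of 0x114 covers 0x100 through
0x114, L 20 with DB covers 0x100 through 0x113, and L 20 with DW covers
0x100 through 0x127 because each object is two bytes.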

4.8  Memory Operators

Memory operators return the content of specific locations in memory.  They are
unary operators that work in the same way regardless of the language selected,
and return the result of a direct memory operation.  They are chiefly of
interest to programmers who debug in assembly mode, and are not necessary for
high-level debugging.

All of the operators listed in this section are part of the CodeView C
expression evaluator and should not be confused with CodeView commands.  As
operators, they can only build expressions, which in turn are used as
arguments in commands.

----
Note
----
The memory operators discussed in this section are only available with the
C expression evaluator, and have lowest precedence of any C operators.
====

4.8.1  Accessing Bytes (BY)

You can access the byte at an address by using the BY operator.  This operator
is useful for simulating the BYTE PTR operation of the Microsoft Macro
Assembler.  It is particularly useful for watching the byte pointed to by a
particular register.

----
Note
----
The examples that follow in Section 4.8 make use of the Display Expression
(?) Command, which is described in Section 6.1.  The x format specifier
causes the debugger to produce output in hexadecimal.
====

<*> Syntax

BY address

The result is a short integer that contains the value of the first byte stored
at address.

<*> Examples

>? BY sum
101

The example above returns the first byte at the address of sum.

>? BY bp+6
42

This example returns the byte pointed to by the BP register, with a displace-
ment of 6.

4.8.2  Accessing Words (WO)

You can access the word at an address by using the WO operator.  This operator
is useful for simulating the WORD PTR operation of the assembler.  It is
particularly useful for watching the word pointed to by a particular register,
such as the stack pointer.

<*> Syntax

WO address

The result is a short integer that contains the value of the first two bytes
stored at address.

<*> Examples

>? WO sum
13120

The example above returns the first word at the address of sum.

>? WO sp,x
2F38

This example returns the word pointed to by the stack pointer; the word
therefore represents the last word pushed (the "top" of the stack).

4.8.3  Accessing Double Words (DW)

You can access the double word at an address by using the DW operator.  This
operator is useful for simulating the DWORD PTR operation of the Microsoft
Macro Assembler.  It is particularly useful for watching the double word
pointed to by a particular register.

<*> Syntax

DW address

The result is a long integer that contains the value of the first four bytes
stored at address.

----
Note
----
Be careful not to confuse the DW operator with the DW command.  The
operator is only useful for building expressions; it occurs within a
CodeView command line, but never at the beginning.  The second use of DW
mentioned above, the Dump Words Command, occurs only at the beginning of a
CodeView command line.  It displays an entire range of memory (in words,
not double words) rather than returning a single result.
====

<*> Examples

>? DW sum
132120365

The example above returns the first double word at the address of sum.

>? DW si,x
3F880000

This example returns the double word pointed to by the SI register.
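
The three memory operators can be modeled as reads of 1, 2, or 4 bytes from a
block of memory.  In the sketch below the functions by, wo, and dw are
hypothetical stand-ins, memory is an ordinary bytes object, and values are
treated as unsigned, which is an assumption.  The low-order byte comes first
because 8086-family processors store multibyte values that way.

```python
import struct


def by(memory, address):
    """Return the byte stored at address."""
    return memory[address]


def wo(memory, address):
    """Return the 16-bit word stored at address (low byte first)."""
    return struct.unpack_from("<H", memory, address)[0]


def dw(memory, address):
    """Return the 32-bit double word stored at address (low byte first)."""
    return struct.unpack_from("<I", memory, address)[0]
```

With memory = bytes([0x78, 0x56, 0x34, 0x12]), the three operators applied at
address 0 give 0x78, 0x5678, and 0x12345678 respectively.  A displacement such
as BY bp+6 corresponds to by(memory, bp + 6) for some register value bp.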

4.9  Switching Expression Evaluators

The CodeView debugger allows you to specify a particular expression evaluator:
BASIC, C, FORTRAN, or Pascal.  You may want to specify the expression
evaluator if you are debugging a source module that does not use the standard
extension of the source language (such as .C for C, .BAS for BASIC, etc.), or
if you want to use a feature of a different language.  For example, you might
be debugging a C program and want to evaluate a string of binary digits.  The
FORTRAN expression evaluator accepts base 2, so you might want to switch
temporarily to the FORTRAN expression evaluator.

It is normally not necessary to specify the evaluator, even if you are debugging
a mixed-language program; the Auto selection changes the expression
evaluator for you.

<*> Mouse

To switch expression evaluators with the mouse, open the Language menu and
click the appropriate language selection.

<*> Keyboard

To switch expression evaluators with a keyboard command, press ALT+L to open
up the Language menu, use the arrow keys (or mnemonic letter) to move to the
appropriate language, then press RETURN.

<*> Dialog

To switch expression evaluators using a dialog command, enter a command line
with the syntax

USE [language]

where language is C, FORTRAN, BASIC, Pascal or Auto.  The command is not case-
sensitive, and you can enter the language name in any combination of uppercase
and lowercase letters.  Entered on a line by itself, USE displays the name of
the current expression evaluator.  The USE command always displays the name of
the current expression evaluator or the new expression evaluator (if
specified).

<*> Examples

>USE fortran
FORTRAN

The example above switches to the FORTRAN expression evaluator.

>USE
BASIC

The example above displays the name of the current expression evaluator, which
in this case happens to be BASIC.
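
The behavior of the USE command and of extension-based selection can be
sketched as follows.  Both functions are hypothetical.  The .PAS extension is
an assumption, the .ASM entry reflects Section 4.5, and what the debugger does
with an unrecognized extension is not covered here, so the sketch returns None
in that case.

```python
import os

LANGUAGES = {"C", "FORTRAN", "BASIC", "PASCAL", "AUTO"}
EVALUATOR_BY_EXTENSION = {
    ".C": "C",
    ".BAS": "BASIC",
    ".FOR": "FORTRAN",
    ".PAS": "PASCAL",  # assumed extension for Pascal sources
    ".ASM": "C",       # assembly sources select the C evaluator
}


def use(current, language=None):
    """Return the name displayed after `USE [language]`."""
    if language is None:
        return current             # USE alone reports the current evaluator
    name = language.upper()        # the language name is not case-sensitive
    if name not in LANGUAGES:
        raise ValueError(f"unknown language: {language!r}")
    return name


def auto_evaluator(source_file):
    """Pick an evaluator from the source-file extension, or None."""
    extension = os.path.splitext(source_file)[1].upper()
    return EVALUATOR_BY_EXTENSION.get(extension)
```

For example, use("BASIC") returns BASIC, use("BASIC", "fortran") returns
FORTRAN, and auto_evaluator("DEMO.C") returns C.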

.end of chapter.